Speech Separation using Non-negative Features and Sparse Non-negative Matrix Factorization
نویسنده
چکیده
This paper describes a method for separating two speakers in a single channel recording. The separation is performed in a low dimensional feature space optimized to represent speech. For each speaker, an overcomplete basis is estimated using sparse non-negative matrix factorization, and a mixture is separated by mapping the mixture onto the joint bases of the two speakers. The method is evaluated in terms of word recognition rate on the speech separation challenge data set.
منابع مشابه
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
متن کاملIterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition
Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...
متن کاملSingle-channel speech separation using sparse non-negative matrix factorization
We apply machine learning techniques to the problem of separating multiple speech sources from a single microphone recording. The method of choice is a sparse non-negative matrix factorization algorithm, which in an unsupervised manner can learn sparse representations of the data. This is applied to the learning of personalized dictionaries from a speech corpus, which in turn are used to separa...
متن کاملSingle-channel Speech Separation Using Orthogonal Matching Pursuit
In this paper, we propose a new sparse decomposition based single-channel speech separation method using orthogonal matching pursuit (OMP). The separation is performed using source-individual dictionaries consisting of time-domain training frames as atoms. OMP is used to compute sparse coefficients to estimate sources. We report the separation results of our proposed method and compare them wit...
متن کاملHeterogeneous Convolutive Non-Negative Sparse Coding
Convolutive non-negative matrix factorization (CNMF) and its sparse version, convolutive non-negative sparse coding (CNSC), exhibit great success in speech processing. A particular limitation of the current CNMF/CNSC approaches is that the convolution ranges of the bases in learning are identical, resulting in patterns covering the same time span. This is obvious unideal as most of sequential s...
متن کامل